Setting an Optimal α That Minimizes Errors in Null Hypothesis Significance Tests

Authors

  • Joseph F. Mudge
  • Leanne F. Baker
  • Christopher B. Edge
  • Jeff E. Houlahan
Abstract

Null hypothesis significance testing has been under attack in recent years, partly owing to the arbitrary nature of setting α (the decision-making threshold and probability of Type I error) at a constant value, usually 0.05. If the goal of null hypothesis testing is to present conclusions in which we have the highest possible confidence, then the only logical decision-making threshold is the value that minimizes the probability (or occasionally, cost) of making errors. Setting α to minimize the combination of Type I and Type II error at a critical effect size can easily be accomplished for traditional statistical tests by calculating the α associated with the minimum average of α and β at the critical effect size. This technique also has the flexibility to incorporate prior probabilities of null and alternate hypotheses and/or relative costs of Type I and Type II errors, if known. Using an optimal α results in stronger scientific inferences because it estimates and minimizes both Type I errors and relevant Type II errors for a test. It also results in greater transparency concerning assumptions about relevant effect size(s) and the relative costs of Type I and II errors. By contrast, the use of α = 0.05 results in arbitrary decisions about what effect sizes will likely be considered significant, if real, and results in arbitrary amounts of Type II error for meaningful potential effect sizes. We cannot identify a rationale for continuing to arbitrarily use α = 0.05 for null hypothesis significance tests in any field, when it is possible to determine an optimal α.
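The calculation described in the abstract can be sketched numerically. The example below is a minimal illustration, not the authors' implementation: it assumes a one-sided, one-sample z-test with known unit variance, and it finds by grid search the α that minimizes the average of α and β at a chosen critical effect size. The function names and the parameters `prior_h0` and `cost_ratio` are hypothetical, and the weighting used to fold in priors and costs is one plausible choice that may differ from the formula in the full paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def type_ii_error(alpha, effect_size, n):
    """Beta for a one-sided, one-sample z-test with known unit variance.

    effect_size is the critical effect size in standard-deviation units
    (an assumption of this sketch).
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    return STD_NORMAL.cdf(z_crit - effect_size * n ** 0.5)


def optimal_alpha(effect_size, n, prior_h0=0.5, cost_ratio=1.0, steps=10_000):
    """Grid-search the alpha that minimizes a weighted average of alpha and beta.

    prior_h0   : assumed prior probability that the null hypothesis is true
    cost_ratio : assumed cost of a Type I error relative to a Type II error
    With the defaults the objective reduces to (alpha + beta) / 2.
    """
    w_alpha = cost_ratio * prior_h0
    w_beta = 1.0 - prior_h0
    best_alpha, best_error = None, float("inf")
    for i in range(1, steps):
        alpha = i / steps
        beta = type_ii_error(alpha, effect_size, n)
        error = (w_alpha * alpha + w_beta * beta) / (w_alpha + w_beta)
        if error < best_error:
            best_alpha, best_error = alpha, error
    return best_alpha, best_error


if __name__ == "__main__":
    alpha, error = optimal_alpha(effect_size=0.5, n=16)
    print(f"optimal alpha = {alpha:.4f}, average error = {error:.4f}")
```

For this simple test the optimum can be checked by hand: with equal priors and costs, α + β is minimized where the critical z-value equals half of the standardized effect, so an effect size of 0.5 with n = 16 gives a critical value of 1 and α = β ≈ 0.159, well above 0.05. Raising `cost_ratio` above 1 pushes the optimal α lower, as would be expected when Type I errors are the more costly ones.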

Related resources

TreeFix: Statistically Informed Gene Tree Error Correction using Species Trees – Supplementary Material

In our discussion of hypothesis testing, we said that trees are statistically equivalent if p ≥ α. However, strictly speaking, failing to reject the null hypothesis does not imply that the null hypothesis is true. For example, it could be that enough variability exists in the sequence information to mask the differences in the statistical support of different topologies. We must therefore also ...

False Discovery Rates

In hypothesis testing, statistical significance is typically based on calculations involving p-values and Type I error rates. A p-value calculated from a single statistical hypothesis test can be used to determine whether there is statistically significant evidence against the null hypothesis. The upper threshold applied to the p-value in making this determination (often 5% in the scientific li...

Design of the Fuzzy Rank Tests Package

denote the critical function of a randomized test having significance level α and point null hypothesis θ, that is, the randomized test rejects the null hypothesis θ = θ0 at level α when the observed data are x with probability φ(x, α, θ0). The requirement that φ(x, α, θ) be a probability restricts it to being between zero and one (inclusive). The requirement that the test have its nominal leve...

The Optimal Discovery Procedure: A New Approach to Simultaneous Significance Testing

Significance testing is one of the main objectives of statistics. The Neyman-Pearson lemma provides a simple rule for optimally testing a single hypothesis when the null and alternative distributions are known. This result has played a major role in the development of significance testing strategies that are used in practice. Most of the work extending single testing strategies to multiple tests...

Guidelines for Multiple Testing in Impact Evaluations of Educational Interventions

Studies that examine the impacts of education interventions on key student, teacher, and school outcomes typically collect data on large samples and on many outcomes. In analyzing these data, researchers typically conduct multiple hypothesis tests to address key impact evaluation questions. Tests are conducted to assess intervention effects for multiple outcomes, for multiple su...

Volume: 7

Publication date: 2012